Datapoint Attributes
The Datapoint Attributes feature is a newly introduced concept in the Energyworx platform designed to make data storage and retrieval more efficient. It is intended to augment and ultimately replace the existing Annotation concept, which has shown performance limitations in both front-end and back-end processing speed. Datapoint Attributes let users attach detailed information to individual data points in a way that is computationally more efficient and reduces the storage footprint.
Applicability
This feature is particularly beneficial in use cases where extensive information is stored and replicated for each data read; in those cases it delivers tangible improvements in both User Interface (UI) responsiveness and backend performance.
Usage
Datapoint attributes can be configured via API v1 and v2, and also through the front end in the new console.
Note: For configuring DPAs in the new console, beta features must be enabled.
Conceptual Framework
Data Points and Attributes
Data Point: A single unit of information, such as a sensor reading, a measurement, or a calculated value.
Attribute: A metadata tag or descriptor associated with a data point to provide additional information or context.
Attribute-Data Point Association
Each data point can be associated with one or more attributes. Unlike annotations, which use a dictionary-based data structure, attributes use a more optimised data structure, thereby reducing the storage overhead and facilitating quicker data access operations.
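The difference between the two layouts can be sketched in plain pandas (this is an illustration only, not Energyworx API code; the keys and values are made up):

```python
import pandas as pd

# Annotation-style: a Python dict stored per data point (object dtype).
# Every row repeats the full key set and carries per-object overhead.
annotations = pd.Series(
    [{"some_rule": "some_value"}, {"some_rule": "some_other_value"}] * 1000
)

# Attribute-style: the value lives in a dedicated, typed column.
attributes = pd.Series(["some_value", "some_other_value"] * 1000).astype("category")

# The dedicated column is markedly cheaper to store and to scan.
print(annotations.memory_usage(deep=True) > attributes.memory_usage(deep=True))  # True
```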
Working with Datapoint Attributes in the Rules Framework
Annotation-Based Exposure
In the case of annotations, values are stored in dictionaries and exposed to rules as illustrated:
2023-01-01T00:00:00 5.0 {"some_rule": "some_value", "some_other_key": "value_2"}
2023-01-01T00:15:00 3.4 {"some_rule": "some_other_value", "some_other_key": "value_3"}
New Datapoint-Attribute-Based Exposure
With Datapoint Attributes, the timeseries data is exposed as follows:
2023-01-01T00:00:00 5.0 "some_value" "value_2"
2023-01-01T00:15:00 3.4 "some_other_value" "value_3"
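Expressed as a pandas DataFrame (an illustration; actual column names depend on your channel and attribute configuration), a rule sees each attribute as an ordinary column:

```python
import pandas as pd

# Illustrative shape of the timeseries exposed to a rule: the channel value
# plus one column per Datapoint Attribute.
df = pd.DataFrame(
    {
        "value": [5.0, 3.4],
        "some_rule": ["some_value", "some_other_value"],
        "some_other_key": ["value_2", "value_3"],
    },
    index=pd.to_datetime(["2023-01-01 00:00:00", "2023-01-01 00:15:00"]),
)

# Attributes are addressed directly as columns; no dict parsing is needed.
print(df.loc["2023-01-01 00:15:00", "some_rule"])  # some_other_value
```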
Loading DPAs in a rule
To load DPAs in a rule, use the TimeseriesService. You cannot load a DPA column on its own: you must load the channel the DPA is attached to. Once a channel is loaded, ALL of its DPAs are loaded with it.
For example:
Suppose Channel A has DPA1 and DPA2, and you need the values of DPA1. You must load Channel A using the TimeseriesService, which loads Channel A together with DPA1 and DPA2.
You can find the details of how to use the TimeseriesService here.
Saving DPAs in a rule
To output DPAs within a rule, add a column to the DataFrame using the naming convention <channel_classifier>.<datapoint_attribute_name>. Data stored in this column will be serialised to a Datapoint Attribute by the rule framework.
You can save DPAs either with the store rule or with the store_timeseries method. You can find the details of how to use the store_timeseries method here.
See the full example code below:
import pandas as pd
from energyworx.rules.base_rule import AbstractRule
from energyworx.domain import RuleResult


class TestRuleRuben(AbstractRule):

    def apply(self, **kwargs):
        # Load the channel data (this also loads all DPAs attached to it)
        start_date = pd.to_datetime("2023-01-01 00:00:00")
        end_date = pd.to_datetime("2026-01-01 00:00:00")
        load_data = self.timeseries_service.get_latest(
            datasource_id=self.datasource.id,
            classifiers=['ACTIVE_DELIVERY_NIGHT_BILLING'],
            date_range=(start_date, end_date),
            include_annotations=False,
        ).as_df()
        # Add the DPA column (the DPA must already exist in the channel classifier)
        load_data['ACTIVE_DELIVERY_NIGHT_BILLING.Quality_Status'] = 'estimated'
        # Store the data with the new DPA
        self.store_timeseries(load_data, store_in_flow_data=False)
        return RuleResult()
Validation and Data Types
Validation Process
Validation occurs at the end of the rule framework run, ensuring that all DPA values conform to the configuration specified in the Channel Classifier.
Data Types
Bool: Should be represented in the DataFrame as the boolean data type.
Enums: Should be placed in the DataFrame as string values representing the enum.
Timestamps: Should be represented in the DataFrame as a DateTime object.
Floats: Should be represented in the DataFrame as the float data type.
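As a sketch of how these types map onto a DataFrame (the "CHANNEL.<ATTRIBUTE>" names below are hypothetical, not real classifiers), the columns would carry the following pandas dtypes:

```python
import pandas as pd

# Illustrative DPA columns with the expected pandas data types.
df = pd.DataFrame({
    "CHANNEL.IS_VALIDATED": pd.Series([True, False], dtype="bool"),        # Bool
    "CHANNEL.QUALITY": pd.Series(["ESTIMATED", "MEASURED"]),               # Enum as string
    "CHANNEL.VALIDATED_AT": pd.to_datetime(["2023-01-01", "2023-01-02"]),  # Timestamp
    "CHANNEL.CONFIDENCE": pd.Series([0.9, 0.75], dtype="float64"),         # Float
})
print(df.dtypes)
```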
When naming an attribute or an enum (aka Symbol), keep in mind:
- Only digits, uppercase letters and underscores (_) are allowed
- We limit the length of an enum/symbol label or attribute name to 32 characters
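These naming rules can be checked up front with a small validator (a sketch, not part of the Energyworx SDK):

```python
import re

# Only digits, uppercase letters and underscores; at most 32 characters.
NAME_PATTERN = re.compile(r"^[0-9A-Z_]{1,32}$")

def is_valid_attribute_name(name: str) -> bool:
    """Check an attribute or enum/symbol label against the naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None

print(is_valid_attribute_name("QUALITY_STATUS"))  # True
print(is_valid_attribute_name("quality-status"))  # False: lowercase and hyphen
print(is_valid_attribute_name("A" * 33))          # False: longer than 32 characters
```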
Technical Considerations
The Datapoint Attributes feature has been designed with performance optimisation as the primary goal, particularly storage efficiency and query speed. It employs an optimised data structure and algorithms with higher data locality, which improves cache utilisation and therefore speeds up data retrieval.